bugfix: Setting OMPI_MPI_THREAD_LEVEL to a value different than requested crashes #13211
base: main
b71461f to b481893
ompi/mpi/c/init_thread.c.in
    */
    if (NULL != (env = getenv("OMPI_MPI_THREAD_LEVEL"))) {
        int env_required = atoi(env);
        err_arg_required = err_arg_required ||
err_arg_required |= ...etc
My biggest concern here is that the user is required to know the valid thread level values as defined in the MPI ABI header. We should keep doing what we do right now, i.e., accept 0-3, and use a shift operation to convert that index to the ABI value: `env_required = (1 << (10 + atoi(env)));`
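The shift suggested in this comment can be sketched as follows. This only illustrates the reviewer's formula: the helper name `thread_index_to_abi_value` is hypothetical, and the bit offset of 10 is taken from the comment itself rather than checked against the MPI ABI header. Unlike `atoi`, `strtol` lets the sketch reject non-numeric input.

```c
#include <stdlib.h>

/* Hypothetical helper: convert a user-facing thread level index (0-3)
 * into the shifted value proposed in the review, 1 << (10 + index).
 * Returns -1 if the string is not an integer in [0, 3]. */
static int thread_index_to_abi_value(const char *env)
{
    char *end = NULL;
    long idx = strtol(env, &end, 10);

    if (end == env || '\0' != *end || idx < 0 || idx > 3) {
        return -1; /* not a number, trailing characters, or out of range */
    }
    return 1 << (10 + (int)idx);
}
```

With this shape, users keep typing 0-3 and only the library needs to know the shifted encoding.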
For now, the valid values are still [0-3] (because our internal ABI is not the same as the standardized ABI (yet?)). I'll add this consideration to the comment requested by Jeff so we don't forget when the time comes.
FWIW: we plan to support both the existing OMPI ABI and the official MPI Forum MPI ABI. Hence, we'll probably want code that can handle both situations.
The important part is that we don't want users to have to enter the ABI values in the environment variable for the thread level. We want this to remain indexed as it is today.
Hmm. Good point.
Maybe we should also accept strings? `MPI_THREAD_SINGLE`, `MPI_THREAD_MULTIPLE`, ... etc. I.e., something that doesn't depend on a numeric value.
Can we defer this to when we implement ABI compat?
It might behoove us to do this now, so that it can be in 6.0.0 -- i.e., at a major version change. ABI may or may not make it into 6.0.0 (e.g., it won't be ready until -- at the earliest -- the end of this summer, and we may have already released 6.0.0 by then, in which case, ABI would be in 6.1.0).
It can't be that hard to also check for string values. I'd suggest the following case-insensitive values:
* `mpi_thread_single` and `single`
* `mpi_thread_serialized` and `serialized`
* `mpi_thread_funneled` and `funneled`
* `mpi_thread_multiple` and `multiple`
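A minimal sketch of what accepting these names could look like, assuming exact (not partial) case-insensitive matching and an optional `mpi_thread_` prefix. The function name `thread_level_from_name` is hypothetical and this is not the code from the PR. The returned index follows the MPI ordering of the levels (SINGLE < FUNNELED < SERIALIZED < MULTIPLE), so it lines up with the existing 0-3 values.

```c
#include <stddef.h>
#include <strings.h> /* strcasecmp, strncasecmp (POSIX) */

/* Hypothetical helper: map a thread level name to its 0-3 index.
 * Accepts "single", "funneled", "serialized", "multiple", with or without
 * the "mpi_thread_" prefix, in any letter case. Returns -1 on no match. */
static int thread_level_from_name(const char *name)
{
    /* Index order follows SINGLE < FUNNELED < SERIALIZED < MULTIPLE. */
    static const char *levels[] = {"single", "funneled", "serialized", "multiple"};
    static const char prefix[] = "mpi_thread_";
    const size_t plen = sizeof(prefix) - 1;

    if (NULL == name) {
        return -1;
    }
    if (0 == strncasecmp(name, prefix, plen)) {
        name += plen; /* skip the optional prefix */
    }
    for (int i = 0; i < 4; i++) {
        if (0 == strcasecmp(name, levels[i])) {
            return i;
        }
    }
    return -1;
}
```

Because the comparison is on the name rather than a number, the accepted strings would not need to change if the numeric encoding of the levels changes later.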
Besides, didn't you make the argument that you want to check for the individual values because of ABI? You can't argue that we should do this PR because ABI is coming, but we shouldn't bother doing all the things because ABI isn't here yet. 😇
It's done.
b481893 to 1a0d2f7
Hello! The Git Commit Checker CI bot found a few problems with this PR: 1a0d2f7: review comments
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
8ad7579 to fe5d20e
Hello! The Git Commit Checker CI bot found a few problems with this PR: b056c23: OMPI_MPI_THREAD_LEVEL can now take 'multiple' 'MPI...
Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!
b056c23 to 5059134
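Hold on a little on this one; I have a fancier solution that I just haven't had the time to integrate. The code is attached below, and it allows partial matching as long as the match is unique. As an example, "s" will not be accepted because it matches both "single" and "serialized".

    #include <stdlib.h>   /* strtol */
    #include <string.h>   /* strlen */
    #include <strings.h>  /* strncasecmp */

    /* Returns the numeric value if `value` is an integer. Otherwise returns the
     * index of the single keyword that `value` (minus an optional preposition)
     * is a case-insensitive prefix of, or -1 if no keyword or several match. */
    int check_env_value(const char** valid_prepositions, const char** keywords, int nb_keywords, const char* value)
    {
        char *end = NULL;
        const char *prep, *token = value; /* full match */
        int pidx = 0, found = -1;
        long v = strtol(value, &end, 10);

        if ((end != value) && ('\0' == *end))  /* the whole string is a number */
            return (int)v;
        while (NULL != (prep = valid_prepositions[pidx])) {
            if (0 == strncasecmp(prep, value, strlen(prep))) {
                token = value + strlen(prep);
                break; /* got a token, let's find a match */
            }
            pidx++;
        }
        for (int i = 0; i < nb_keywords; i++) {
            if (0 == strncasecmp(keywords[i], token, strlen(token))) {
                if (-1 != found) { /* not the first match, bail out */
                    return -1;
                }
                found = i;
            }
        }
        return found;
    }

    /* MPI orders the levels SINGLE < FUNNELED < SERIALIZED < MULTIPLE, so the
     * keywords are listed in that order and the index equals the 0-3 level. */
    static const char* keywords[] = {"single", "funneled", "serialized", "multiple"};
    static const char* prepositions[] = {"mpi_thread_", "thread_", NULL};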
I think you have perms to push to my branch if you want to do that.
I think we're going to need to get this documented, too. It's cool new functionality, but if it doesn't appear in a man page and/or some other docs, no one will ever use it.
`requested` in `MPI_Init_thread` would invoke the error handler, even though it is a useful override in some threaded library use cases.
Signed-off-by: Aurelien Bouteiller <[email protected]>
(single,etc) in addition to numeric 0-3 values
Signed-off-by: Aurelien Bouteiller <[email protected]>
This function supports prepositions (such as mpi_thread_) and partial matching (such as "fun" for funneled).
Signed-off-by: George Bosilca <[email protected]>
54682e4 to 0f4673c
I took the liberty of updating the MPI_Init* man pages. You're welcome. 😇
I made minor changes to some MPI_Session_* man pages, too, but most changes were to
0f4673c to 027fda5
…ages
Including, but not limited to:
* Added much more description of and distinction between the MPI world model and the MPI session model. Updated a lot of old, pre-MPI-world-model/pre-MPI-session-model text that was now stale / outdated, especially in the following pages:
    * MPI_Init(3), MPI_Init_thread(3)
    * MPI_Initialized(3)
    * MPI_Finalize(3)
    * MPI_Finalized(3)
    * MPI_Session_init(3)
    * MPI_Session_finalize(3)
* Numerous formatting updates
* Slightly improve the C code examples
* Describe the mathematical relationship between the various MPI_THREAD_* constants in MPI_Init_thread(3)
    * Note that the mathematical relationships render nicely in HTML, but don't render entirely properly in nroff. This commit author is of the opinion that the nroff rendering is currently "good enough", and some Sphinx maintainer will fix it someday.
* Add descriptions about the $OMPI_MPI_THREAD_LEVEL env variable and how it is used in MPI_Init_thread(3)
* Added more seealso links
Signed-off-by: Jeff Squyres <[email protected]>
027fda5 to 1e66d4c
It occurred to me overnight that the updates I made in MPI_Init*(3) and MPI_Finalize(3) showed that none of our man pages had been updated to reflect that the world model now exists (and is distinct from the MPI session model). So I updated even more text this morning to describe and clarify the MPI world model vs. the MPI session model. I.e., in documenting the
Reviewers should read these man pages:
Setting OMPI_MPI_THREAD_LEVEL to a value different than `requested` in `MPI_Init_thread` would invoke the error handler, even though it is a useful override in some threaded library use cases.
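A minimal sketch of the override described here, under the assumption that a valid 0-3 value in the environment simply replaces `requested`, and that an invalid value is reported back to the caller instead of being silently ignored. The function name `effective_thread_level` is hypothetical and this is not the code from the PR; the environment string is passed as a parameter so the logic can be exercised without touching the process environment.

```c
#include <stdlib.h>

/* Hypothetical helper, not the PR's code. `env` is the string that
 * getenv("OMPI_MPI_THREAD_LEVEL") would return, or NULL if unset. */
static int effective_thread_level(int requested, const char *env)
{
    char *end = NULL;
    long level;

    if (NULL == env) {
        return requested; /* no override present */
    }
    level = strtol(env, &end, 10);
    if (end == env || '\0' != *end || level < 0 || level > 3) {
        return -1; /* invalid override: left to the caller to report */
    }
    return (int)level; /* override wins, even if it differs from requested */
}
```

A caller would use it as `effective_thread_level(requested, getenv("OMPI_MPI_THREAD_LEVEL"))`, so a mismatch between the two values is treated as an intentional override rather than an argument error.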